{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Multi-label deep learning with scikit-multilearn\n",
    "\n",
    "Deep learning methods have expanded in the python community with many tutorials on performing classification using neural networks, however few out-of-the-box solutions exist for multi-label classification with deep learning, scikit-multilearn allows you to deploy single-class and multi-class DNNs to solve multi-label problems via problem transformation methods. Two main deep learning frameworks exist for Python: keras and pytorch, you will learn how to use any of them for multi-label problems with scikit-multilearn. Let's start with loading some data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "emotions:train - exists, not redownloading\n",
      "emotions:test - exists, not redownloading\n"
     ]
    }
   ],
   "source": [
    "import numpy\n",
    "import sklearn.metrics as metrics\n",
    "from skmultilearn.dataset import load_dataset\n",
    "\n",
    "X_train, y_train, feature_names, label_names = load_dataset(\"emotions\", \"train\")\n",
    "X_test, y_test, _, _ = load_dataset(\"emotions\", \"test\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Keras\n",
    "\n",
    "Keras is a neural network library that supports multiple backends, most notably the well-established tensorflow, but also the popular on Windows: CNTK, as scikit-multilearn supports both Windows, Linux and MacOSX, you can you a backend of choice, as described in the backend selection tutorial. To install Keras run:\n",
    "\n",
    "```bash\n",
    "pip install -U keras\n",
    "```\n",
    "\n",
    "### Single-class Keras classifier\n",
    "We train a two-layer neural network using Keras and tensortflow as backend (feel free to use others), the network is fairly simple 12 x 8 RELU that finish with a sigmoid activator optimized via binary cross entropy. This is a case from the [Keras example page](https://keras.io/scikit-learn-api/). Note that the model creation function must create a model that accepts an input dimension and outpus a relevant output dimension. The Keras wrapper from scikit-multilearn will pass relevant dimensions upon fitting. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Using TensorFlow backend.\n"
     ]
    }
   ],
   "source": [
    "from keras.models import Sequential\n",
    "from keras.layers import Dense\n",
    "\n",
    "\n",
    "def create_model_single_class(input_dim, output_dim):\n",
    "    # create model\n",
    "    model = Sequential()\n",
    "    model.add(Dense(12, input_dim=input_dim, activation=\"relu\"))\n",
    "    model.add(Dense(8, activation=\"relu\"))\n",
    "    model.add(Dense(output_dim, activation=\"sigmoid\"))\n",
    "    # Compile model\n",
    "    model.compile(loss=\"binary_crossentropy\", optimizer=\"adam\", metrics=[\"accuracy\"])\n",
    "    return model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's use it with a problem transformation method which converts multi-label classification problems to single-label single-class problems, ex. Binary Relevance which trains a classifier per label. We will use 10 epochs and disable verbosity."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.42574257425742573"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from skmultilearn.problem_transform import BinaryRelevance\n",
    "from skmultilearn.ext import Keras\n",
    "\n",
    "KERAS_PARAMS = dict(epochs=10, batch_size=100, verbose=0)\n",
    "\n",
    "clf = BinaryRelevance(\n",
    "    classifier=Keras(create_model_single_class, False, KERAS_PARAMS),\n",
    "    require_dense=[True, True],\n",
    ")\n",
    "clf.fit(X_train, y_train)\n",
    "result = clf.predict(X_test)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Multi-class Keras classifier\n",
    "\n",
    "We now train a multi-class neural network using Keras and tensortflow as backend (feel free to use others) optimized via categorical cross entropy. This is a case from the [Keras multi-class tutorial](https://machinelearningmastery.com/multi-class-classification-tutorial-keras-deep-learning-library/). Note again that the model creation function must create a model that accepts an input dimension and outpus a relevant output dimension. The Keras wrapper from scikit-multilearn will pass relevant dimensions upon fitting. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_model_multiclass(input_dim, output_dim):\n",
    "    # create model\n",
    "    model = Sequential()\n",
    "    model.add(Dense(8, input_dim=input_dim, activation=\"relu\"))\n",
    "    model.add(Dense(output_dim, activation=\"softmax\"))\n",
    "    # Compile model\n",
    "    model.compile(\n",
    "        loss=\"categorical_crossentropy\", optimizer=\"adam\", metrics=[\"accuracy\"]\n",
    "    )\n",
    "    return model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We use the Label Powerset multi-label to multi-class transformation approach, but this can also be used with all the advanced label space division methods available in scikit-multilearn. Note that we set the second parameter of our Keras wrapper to true, as the base problem is multi-class now."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "from skmultilearn.problem_transform import LabelPowerset\n",
    "\n",
    "clf = LabelPowerset(\n",
    "    classifier=Keras(create_model_multiclass, True, KERAS_PARAMS),\n",
    "    require_dense=[True, True],\n",
    ")\n",
    "clf.fit(X_train, y_train)\n",
    "y_pred = clf.predict(X_test)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pytorch\n",
    "\n",
    "Pytorch is another often used library, that is compatible with scikit-multilearn via the skorch wrapping library, to use it, you must first install the required libraries:\n",
    "\n",
    "```bash\n",
    "pip install -U skorch torch\n",
    "```\n",
    "\n",
    "To start, import:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "from torch import nn\n",
    "import torch.nn.functional as F\n",
    "from skorch import NeuralNetClassifier"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Single-class pytorch classifier\n",
    "We train a two-layer neural network using pytorch based on a simple example from the [pytorch example page](https://nbviewer.jupyter.org/github/dnouri/skorch/blob/master/notebooks/Basic_Usage.ipynb). Note that the model's first layer has to agree in size with the input data, and the model's last layer is two-dimensions, as there are two classes: 0 or 1."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 99,
   "metadata": {},
   "outputs": [],
   "source": [
    "input_dim = X_train.shape[1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 100,
   "metadata": {},
   "outputs": [],
   "source": [
    "class SingleClassClassifierModule(nn.Module):\n",
    "    def __init__(\n",
    "        self,\n",
    "        num_units=10,\n",
    "        nonlin=F.relu,\n",
    "        dropout=0.5,\n",
    "    ):\n",
    "        super(SingleClassClassifierModule, self).__init__()\n",
    "        self.num_units = num_units\n",
    "\n",
    "        self.dense0 = nn.Linear(input_dim, num_units)\n",
    "        self.dense1 = nn.Linear(num_units, 10)\n",
    "        self.output = nn.Linear(10, 2)\n",
    "\n",
    "    def forward(self, X, **kwargs):\n",
    "        X = F.relu(self.dense0(X))\n",
    "        X = F.relu(self.dense1(X))\n",
    "        X = torch.sigmoid(self.output(X))\n",
    "        return X"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We now wrap the model with skorch and use scikit-multilearn for Binary Relevance classification."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 101,
   "metadata": {},
   "outputs": [],
   "source": [
    "net = NeuralNetClassifier(SingleClassClassifierModule, max_epochs=20, verbose=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 96,
   "metadata": {},
   "outputs": [],
   "source": [
    "from skmultilearn.problem_transform import BinaryRelevance\n",
    "\n",
    "clf = BinaryRelevance(classifier=net, require_dense=[True, True])\n",
    "clf.fit(X_train.astype(numpy.float32), y_train)\n",
    "y_pred = clf.predict(X_test.astype(numpy.float32))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Multi-class pytorch classifier"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Similarly we can train a multi-class DNN, this time hte last layer must agree with size with the number of classes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 102,
   "metadata": {},
   "outputs": [],
   "source": [
    "nodes = 8\n",
    "input_dim = X_train.shape[1]\n",
    "hidden_dim = int(input_dim / nodes)\n",
    "output_dim = len(numpy.unique(y_train.rows))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 103,
   "metadata": {},
   "outputs": [],
   "source": [
    "class MultiClassClassifierModule(nn.Module):\n",
    "    def __init__(\n",
    "        self,\n",
    "        input_dim=input_dim,\n",
    "        hidden_dim=hidden_dim,\n",
    "        output_dim=output_dim,\n",
    "        dropout=0.5,\n",
    "    ):\n",
    "        super(MultiClassClassifierModule, self).__init__()\n",
    "        self.dropout = nn.Dropout(dropout)\n",
    "\n",
    "        self.hidden = nn.Linear(input_dim, hidden_dim)\n",
    "        self.output = nn.Linear(hidden_dim, output_dim)\n",
    "\n",
    "    def forward(self, X, **kwargs):\n",
    "        X = F.relu(self.hidden(X))\n",
    "        X = self.dropout(X)\n",
    "        X = F.softmax(self.output(X), dim=-1)\n",
    "        return X"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's skorch-wrap it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 104,
   "metadata": {},
   "outputs": [],
   "source": [
    "net = NeuralNetClassifier(MultiClassClassifierModule, max_epochs=20, verbose=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 105,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/opt/conda/lib/python3.6/site-packages/sklearn/model_selection/_split.py:626: Warning: The least populated class in y has only 1 members, which is too few. The minimum number of members in any class cannot be less than n_splits=5.\n",
      "  % (min_groups, self.n_splits)), Warning)\n"
     ]
    }
   ],
   "source": [
    "from skmultilearn.problem_transform import LabelPowerset\n",
    "\n",
    "clf = LabelPowerset(classifier=net, require_dense=[True, True])\n",
    "clf.fit(X_train.astype(numpy.float32), y_train)\n",
    "y_pred = clf.predict(X_test.astype(numpy.float32))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
